Effective Genetic Risk Prediction Using Mixed Models
To date, efforts to produce high-quality polygenic risk scores from
genome-wide studies of common disease have focused on estimating and
aggregating the effects of multiple SNPs. Here we propose a novel statistical
approach for genetic risk prediction, based on random and mixed effects models.
Our approach (termed GeRSI) circumvents the need to estimate the effect sizes
of numerous SNPs by treating these effects as random, producing predictions
which are consistently superior to the current state of the art, as we
demonstrate in extensive simulations. When applying GeRSI to seven phenotypes
from the WTCCC
study, we confirm that the use of random effects is most beneficial for
diseases that are known to be highly polygenic: hypertension (HT) and bipolar
disorder (BD). For HT, there are no significant associations in the WTCCC data.
The best existing model yields an AUC of 54%, while GeRSI improves it to 59%.
For BD, using GeRSI improves the AUC from 55% to 62%. For individuals ranked at
the top 10% of BD risk predictions, using GeRSI substantially increases the BD
relative risk from 1.4 to 2.5.
Comment: main text: 14 pages, 3 figures. Supplementary text: 16 pages, 21
figures.
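The mixed-effects idea in this abstract can be illustrated with a generic genomic BLUP sketch: rather than estimating each SNP's effect, treat the effects as random, summarize the genotypes in a genetic relationship matrix (GRM), and predict test individuals' genetic values from the training phenotypes. This is a minimal stand-in, not the GeRSI method itself; the function names and the fixed `h2` parameter are assumptions for illustration.

```python
import numpy as np

def grm(X):
    """Genetic relationship matrix from an n x m genotype matrix
    (individuals x SNPs), using standardized genotypes."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    return Xs @ Xs.T / X.shape[1]

def blup_predict(K, y_train, train_idx, test_idx, h2=0.5):
    """BLUP of genetic values for test individuals.

    K: full GRM over all individuals; h2: assumed heritability, i.e. the
    variance ratio that shrinks the random SNP effects.  The prediction is
    K_st (K_tt + lambda I)^{-1} y_train with lambda = (1 - h2) / h2.
    """
    Ktt = K[np.ix_(train_idx, train_idx)]
    Kst = K[np.ix_(test_idx, train_idx)]
    lam = (1.0 - h2) / h2
    return Kst @ np.linalg.solve(Ktt + lam * np.eye(len(train_idx)), y_train)
```

The key point mirrored from the abstract: no per-SNP effect sizes are ever estimated; only the n x n relationship matrix and one variance ratio enter the prediction.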
A method for generating realistic correlation matrices
Simulating sample correlation matrices is important in many areas of
statistics. Approaches such as generating Gaussian data and finding their
sample correlation matrix or generating random uniform deviates as
pairwise correlations both have drawbacks. We develop an algorithm for adding
noise, in a highly controlled manner, to general correlation matrices. In many
instances, our method yields results which are superior to those obtained by
simply simulating Gaussian data. Moreover, we demonstrate how our general
algorithm can be tailored to a number of different correlation models. Using
our results with a few different applications, we show that simulating
correlation matrices can help assess statistical methodology.
Comment: Published at http://dx.doi.org/10.1214/13-AOAS638 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org).
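A simple version of the "add noise to a correlation matrix, in a controlled manner" idea can be sketched as follows. This is not the paper's algorithm; it is a common baseline (symmetric noise followed by eigenvalue clipping and rescaling to unit diagonal), with all names and the noise scale chosen for illustration.

```python
import numpy as np

def perturb_corr(R, eps=0.1, seed=0):
    """Add symmetric Gaussian noise to a correlation matrix R, then repair
    the result so it is again a valid correlation matrix:
    1) symmetrize the noise and zero its diagonal,
    2) clip eigenvalues to be nonnegative (nearest-PSD projection),
    3) rescale to unit diagonal."""
    rng = np.random.default_rng(seed)
    p = R.shape[0]
    E = rng.normal(scale=eps, size=(p, p))
    E = (E + E.T) / 2.0
    np.fill_diagonal(E, 0.0)
    M = R + E
    w, V = np.linalg.eigh(M)
    M = V @ np.diag(np.clip(w, 1e-8, None)) @ V.T
    d = np.sqrt(np.diag(M))
    return M / np.outer(d, d)
```

The repair step matters because raw perturbation can produce indefinite matrices, which are not sample correlation matrices of any dataset.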
An Evolutionary Perspective of Animal MicroRNAs and Their Targets
MicroRNAs (miRNAs) are short noncoding RNAs that regulate gene expression through translational inhibition or mRNA degradation by binding to sequences on the target mRNA. miRNA regulation appears to be the most abundant mode of posttranscriptional regulation, affecting ~50% of the transcriptome. miRNA genes are often clustered and/or located in introns, and each targets a variable and often large number of mRNAs. Here we discuss the genomic architecture of animal miRNA genes and their evolving interaction with their target mRNAs.
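The binding mechanism mentioned above is commonly operationalized as "seed matching": target sites in an mRNA are complementary to nucleotides 2-8 of the miRNA. A toy scan for such sites can be sketched as follows (a simplified illustration only; real target prediction also weighs site context, conservation, and pairing stability):

```python
def revcomp(seq):
    """Reverse complement of an RNA sequence."""
    comp = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(comp[b] for b in reversed(seq))

def seed_sites(mirna, utr):
    """Positions in an mRNA 3'UTR (RNA alphabet) that are perfectly
    complementary to the miRNA 7-mer seed (positions 2-8, 1-based)."""
    site = revcomp(mirna[1:8])
    return [i for i in range(len(utr) - 6) if utr[i:i + 7] == site]
```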
Measuring missing heritability: Inferring the contribution of common variants
Genome-wide association studies (GWASs), also called common variant association studies (CVASs), have uncovered thousands of genetic variants associated with hundreds of diseases. However, the variants that reach statistical significance typically explain only a small fraction of the heritability. One explanation for the "missing heritability" is that there are many additional disease-associated common variants whose effects are too small to detect with current sample sizes. It therefore is useful to have methods to quantify the heritability due to common variation, without having to identify all causal variants. Recent studies applied restricted maximum likelihood (REML) estimation to case-control studies for diseases. Here, we show that REML considerably underestimates the fraction of heritability due to common variation in this setting. The degree of underestimation increases with the rarity of disease, the heritability of the disease, and the size of the sample. Instead, we develop a general framework for heritability estimation, called phenotype correlation-genotype correlation (PCGC) regression, which generalizes the well-known Haseman-Elston regression method. We show that PCGC regression yields unbiased estimates. Applying PCGC regression to six diseases, we estimate the proportion of the phenotypic variance due to common variants to range from 25% to 56% and the proportion of heritability due to common variants from 41% to 68% (mean 60%). These results suggest that common variants may explain at least half the heritability for many diseases. PCGC regression also is readily applicable to other settings, including analyzing extreme-phenotype studies and adjusting for covariates such as sex, age, and population structure.
National Institutes of Health (U.S.) (NIH HG003067); Broad Institute of MIT and Harvard
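The Haseman-Elston idea that PCGC regression generalizes can be sketched directly: regress products of (standardized) phenotypes over pairs of individuals on the corresponding genetic correlations; the slope estimates the heritability captured by the genotypes. This is the classical HE baseline, not PCGC itself, and the helper below is a minimal illustration.

```python
import numpy as np

def he_regression(K, y):
    """Haseman-Elston style heritability estimate.

    For standardized phenotypes, E[y_i * y_j] = h2 * K_ij for i != j,
    so the slope of y_i*y_j on K_ij over all distinct pairs estimates h2.
    """
    y = (y - y.mean()) / y.std()
    iu = np.triu_indices_from(K, k=1)       # distinct pairs only
    k_pairs = K[iu]
    yy_pairs = np.outer(y, y)[iu]
    slope, _ = np.polyfit(k_pairs, yy_pairs, 1)
    return slope
```

PCGC's contribution, per the abstract, is correcting this kind of moment estimator for case-control ascertainment and covariates, where REML is shown to be biased.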
EST2Prot: Mapping EST sequences to proteins
BACKGROUND: EST libraries are used in various biological studies, from microarray experiments to proteomic and genetic screens. These libraries usually contain many uncharacterized ESTs that are typically ignored since they cannot be mapped to known genes. Consequently, new discoveries are possibly overlooked. RESULTS: We describe a system (EST2Prot) that uses multiple elements to map EST sequences to their corresponding protein products. EST2Prot uses UniGene clusters, substring analysis, information about protein coding regions in existing DNA sequences and protein database searches to detect protein products related to a query EST sequence. Gene Ontology terms, Swiss-Prot keywords, and protein similarity data are used to map the ESTs to functional descriptors. CONCLUSION: EST2Prot extends and significantly enriches the popular UniGene mapping by utilizing multiple relations between known biological entities. It produces a mapping between ESTs and proteins in real-time through a simple web-interface. The system is part of the Biozon database and is accessible at
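One ingredient the abstract names, substring analysis against protein coding regions, can be illustrated with a toy mapper: translate the EST in its three forward reading frames and look for a sufficiently long peptide fragment inside known proteins. This is a simplified stand-in for EST2Prot's pipeline (which also uses UniGene clusters and database searches); all names and the length threshold are illustrative.

```python
from itertools import product

# Standard codon table, built compactly in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def frames(dna):
    """Translate a DNA sequence in its three forward reading frames."""
    for off in range(3):
        yield "".join(CODON[dna[i:i + 3]] for i in range(off, len(dna) - 2, 3))

def map_est(est, proteins, min_len=8):
    """Report proteins containing a peptide fragment (between stop codons,
    at least min_len residues) translated from any frame of the EST."""
    hits = set()
    for pep in frames(est):
        for chunk in pep.split("*"):
            if len(chunk) >= min_len:
                hits.update(name for name, prot in proteins.items()
                            if chunk in prot)
    return hits
```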
Demystifying the Adversarial Robustness of Random Transformation Defenses
Neural networks' lack of robustness against attacks raises concerns in
security-sensitive settings such as autonomous vehicles. While many
countermeasures may look promising, only a few withstand rigorous evaluation.
Defenses using random transformations (RT) have shown impressive results,
particularly BaRT (Raff et al., 2019) on ImageNet. However, this type of
defense has not been rigorously evaluated, leaving its robustness properties
poorly understood. Their stochastic properties make evaluation more challenging
and render many proposed attacks on deterministic models inapplicable. First,
we show that the BPDA attack (Athalye et al., 2018a) used in BaRT's evaluation
is ineffective and likely overestimates its robustness. We then attempt to
construct the strongest possible RT defense through the informed selection of
transformations and Bayesian optimization for tuning their parameters.
Furthermore, we create the strongest possible attack to evaluate our RT
defense. Our new attack vastly outperforms the baseline, reducing the accuracy
by 83% compared to the 19% reduction by the commonly used EoT attack. Our
result indicates that the RT defense on the Imagenette dataset (a ten-class
subset of ImageNet) is not robust against
adversarial examples. Extending the study further, we use our new attack to
adversarially train the RT defense (called AdvRT), resulting in a large robustness
gain. Code is available at
https://github.com/wagner-group/demystify-random-transform.
Comment: ICML 2022 (short presentation), AAAI 2022 AdvML Workshop (best paper,
oral presentation).
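The EoT attack mentioned in this abstract has a simple core: since a randomized defense makes the loss stochastic, attack the *expected* loss by averaging gradients over sampled transformations. A minimal sketch of that gradient estimator (a generic illustration, not the paper's attack; the toy model in the usage below assumes identity-Jacobian transforms such as additive noise):

```python
import numpy as np

def eot_gradient(x, grad_fn, sample_transform, n_samples=32, seed=0):
    """Expectation-over-Transformation (EoT) gradient estimate.

    Averages grad_fn over randomly transformed copies of the input, so an
    attack step targets the defense's expected behavior rather than a single
    random draw.  For transforms with a nontrivial Jacobian, grad_fn would
    need to backpropagate through the transform as well.
    """
    rng = np.random.default_rng(seed)
    g = np.zeros_like(x, dtype=float)
    for _ in range(n_samples):
        t = sample_transform(rng)
        g += grad_fn(t(x))
    return g / n_samples
```

An attacker would then take signed gradient steps (PGD-style) using this averaged gradient; the abstract's point is that even stronger transformation-aware attacks can drive the defense's accuracy far below what EoT suggests.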